Parallelize CI script to improve run time#4357
Parallelize CI script to improve run time#4357joostjager wants to merge 2 commits intolightningdevkit:mainfrom
Conversation
|
👋 Thanks for assigning @tnull as a reviewer! |
ci/ci-tests.sh
Outdated
| } | ||
|
|
||
| # Initialize GitHub Actions Job Summary | ||
| if [ -n "$GITHUB_STEP_SUMMARY" ]; then |
There was a problem hiding this comment.
Given this is just nice printing should we always turn it on rather than only for CI?
There was a problem hiding this comment.
We can do that, but in which file would you want to save the output? The file is, during CI, in $GITHUB_STEP_SUMMARY
There was a problem hiding this comment.
Oh I just skimmed it and thought it printed didn't write to a file.
There was a problem hiding this comment.
Wasn't happy with the way github does the step summary. Now pushed a different approach that just splits up the workflow in more granular steps.
There was a problem hiding this comment.
Bleh, no, ci-tests.sh exists so that we're not tied to github and so that people can run it locally. Splitting it into a million bits breaks those existing goals.
There was a problem hiding this comment.
You can still run ci-tests.sh
13792ed to
03aa9f1
Compare
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #4357 +/- ##
==========================================
- Coverage 86.06% 85.90% -0.17%
==========================================
Files 156 156
Lines 103623 103958 +335
Branches 103623 103958 +335
==========================================
+ Hits 89183 89303 +120
- Misses 11924 12137 +213
- Partials 2516 2518 +2
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
Not happy with this yet, exploring other options. |
4f74e2d to
6f524c7
Compare
TheBlueMatt
left a comment
There was a problem hiding this comment.
I'm still not really sold on this - we're gonna end up missing a step that we add in CI or something and is more logic in ci-tests.sh really helpful - how often are you actually looking at individual steps to understand the performance profile of ci-tests.sh?
|
It's not so much about performance profile anymore. That was what I had initially. The current shape of it is to make it easy to see which test failed and jump to log files. It's friction that I experience with one big log file. Adding things in CI is not something that happens very often. I think it is worth doing this change. If you add to CI and follow the pattern, it seems unlikely that something is forgotten. |
|
🔔 1st Reminder Hey @tnull! This PR has been waiting for your review. |
tnull
left a comment
There was a problem hiding this comment.
Concept ACK.
I like the idea as it makes it easier to just run a specific subset of tests/checks, while still maintaining backwards compat.
Breaking the CI tests into individual steps might also be a good first step before we move some of them into a nightly job eventually?
ci/ci-tests.sh
Outdated
| workspace-tests | ||
| ldk-upgrade-tests | ||
| workspace-member-checks | ||
| lightning-dnssec |
There was a problem hiding this comment.
Can we make the crate-related step naming consistent? Currently some are using just an abbreviated prefix (tx-sync, block-sync), while others like this spell out the full crate name. If we intend this to be used by devs regularly, would be good to make the step names predictable, so that we don't have to always look up what the exact step is called.
There was a problem hiding this comment.
Yes, good one, done.
Also renamed step IDs to be action-oriented (e.g., test-lightning-block-sync instead of lightning-block-sync-tests), matching how the echo statements and GitHub step names already use verbs like "Testing..." or "Checking...".
e08a604 to
8a620a5
Compare
But even with more discrete steps you're still scrolling up until you find the
Doesn't this happen automatically (after github finishes loading....) I'm really not happy seeing a laundry list of steps copied between the bash script and github config - these are inevitably going to get out of sync and we'll end up having steps skipped in CI. |
Right, I don't disagree, but this doesn't fix that? There are still multiple commands run in many steps and you still have to scoll up and find the one that failed. #4368 should improve things a bit, but if you actually want to fix this problem probably we need something in |
|
Agreed that #4368 improves it a bit, but I still think that also having these github-native sections is better. Even if they sometimes group multiple commands. |
|
Worth noting that this also helps when waiting for CI to complete, which can take a while. With the current single-step approach there's no way to tell how far along a run is. You just see it running. With split steps you can see which step is currently executing, giving a sense of progress. GitHub also shows timing per step, which makes it easier to identify where optimization efforts would have the most impact. |
Hmm, I see the point about things potentially getting out-of-sync at some point. No strong opinion either way, but even if we end up not doing the CI changes, can we at least keep the splittin of |
|
The out of sync risk is addressed in the extra commit that I pushed where completion of all steps is verified. I also think that step selection can be useful in CI other than for reporting purposes. Currently our CI is painfully slow, and there are probably build matrix / step combinations that can for example be moved to a nightly job. |
c5aa8da to
d566539
Compare
|
Rebased onto |
tnull
left a comment
There was a problem hiding this comment.
Generally still makes sense to me, especially the part that allows to locally run a sub-set of the CI script. Changes look good to me, maybe mod one nit.
Seems we might need to discuss this offline then to make progress here.
ci/ci-tests.sh
Outdated
|
|
||
| # Log the completed step to GITHUB_ENV for the verification step | ||
| if [ -n "$GITHUB_ENV" ]; then | ||
| echo "CI_COMPLETED_STEPS=${CI_COMPLETED_STEPS:-} $STEP" >> "$GITHUB_ENV" |
There was a problem hiding this comment.
nit: It's a bit weird to modify the external env variable here.
There was a problem hiding this comment.
do you have a suggestion how to do it otherwise? We need to write the modified CI_COMPLETED_STEPS var to the github file in order for it to carry over to the next step, and ultimately to the verification step that checks that we didn't forget anything.
6d79246 to
ac73165
Compare
|
The build matrix is now dynamic. A new Adding the For branch protection, require |
ac73165 to
e1385d1
Compare
.github/workflows/build.yml
Outdated
| - name: Determine matrix and scope | ||
| id: config | ||
| run: | | ||
| FULL_MATRIX='{"include":[{"platform":"self-hosted","toolchain":"stable"},{"platform":"self-hosted","toolchain":"beta"},{"platform":"self-hosted","toolchain":"1.75.0"},{"platform":"windows-latest","toolchain":"stable"},{"platform":"macos-latest","toolchain":"stable"},{"platform":"macos-latest","toolchain":"1.75.0"}]}' |
There was a problem hiding this comment.
Sheesh can we just have a separate yml file for nightly jobs?
There was a problem hiding this comment.
Both concepts aren't mutually exclusive. I thought just doing limited scope testing for PRs that are in progress is useful too. Of course if we move the whole build matrix to the nightly, this is unnecessary, yes.
.github/workflows/build.yml
Outdated
| - name: Check and build docs for workspace members | ||
| shell: bash | ||
| run: CI_ENV=1 CI_MINIMIZE_DISK_USAGE=1 ./ci/ci-tests.sh check-workspace-members | ||
| # --- Label-gated: block-sync --- |
There was a problem hiding this comment.
Looks like these are currently not being run on the normal build.
There was a problem hiding this comment.
This PR demonstrates using labels to select CI test subsets, with a full run gated on merge.
A normal PR push only runs core steps. The other groups (block-sync, no-std, c-bindings, cfg-flags) are skipped unless you add the corresponding label. What is a "normal run"? Intentionally just the core. The idea is to deload CI for everyday pushes. Before merging, you add the full-test label to run the complete suite.
ci/ci-tests.sh
Outdated
| WORKSPACE_MEMBERS=( $(cat Cargo.toml | tr '\n' '\r' | sed 's/\r //g' | tr '\r' '\n' | grep '^members =' | sed 's/members.*=.*\[//' | tr -d '"' | tr ',' ' ') ) | ||
|
|
||
| # Verify that all steps were executed (called at the end of CI) | ||
| if [ "$1" = "--verify-complete" ]; then |
There was a problem hiding this comment.
Meh come on, originally the stated purpose of this PR was to make it easier to find out what failed. It didn't really accomplish that and after the --quiet change (well, and a few more warning fixups that we need to do?) this isn't really an issue anymore (you do need to scroll from the top, but the entire CI run is only a few pages of output so that's fast). Now we have a bunch of complexity to validate steps are all run and for what?
Now we want to see when CI passes...something? Personally I run ci-tests.sh locally and just wait for the first cargo test to complete, which is quite obvious from the output so I'm not really sure what the goal here is.
I'm also pretty dubious the "we want to split CI for local testing" goal is accomplished - the steps here are clearly not conducive to that, if the goal is local subset testing we should probably group tests by useful tags (not just individual steps) so that we can do things like "run all tests testing no-std" or "run all tests on lightning-transaction-sync", even though there's overlap of those. But more generally I'm not quite sure what the exact use-case here is - what do we want to accomplish that cargo test -p lightning-transaction-sync doesn't?
There was a problem hiding this comment.
Fair points on the earlier motivations. --quiet addresses most of the failure-finding friction, and local subset testing isn't the main driver.
The remaining motivation is CI load. Every push currently runs the full suite across all matrix combinations. The step splitting enables gating: core tests on every push, heavier groups only when relevant via labels, full suite required pre-merge via full-test. The --verify-complete step catches any drift between the script and the workflow.
If a separate nightly yml is preferred for the heavier jobs, that works too, but then we lose the ability to run them on-demand for a specific PR via labels.
To be clear, this PR has become exploratory at this point. It was a concrete merge proposal when it just split steps, but has evolved into a proof of concept for label-gated CI.
e1385d1 to
c011c0b
Compare
|
Abandoned the granular steps approach. Since all steps need to run for every PR anyway, the real win is parallelism, not granularity. This force-push replaces the previous approach with 4 parallel jobs via a reusable workflow. Wall-clock time should go down from ~150 min to ~49 min. First run timings for the
|
c011c0b to
c2c0315
Compare
|
Rebalanced jobs a bit. Longest job now ~32 mins |
.github/workflows/build.yml
Outdated
| setup-bitcoind: true | ||
| ci-env: true | ||
|
|
||
| build-complete: |
There was a problem hiding this comment.
What's the point of this? Won't the individual jobs report errors?
There was a problem hiding this comment.
This was just to have a single gating condition. Will remove.
.github/workflows/ci-build.yml
Outdated
| install-arm-target: | ||
| description: Whether to install ARM embedded target | ||
| type: boolean | ||
| default: false | ||
| setup-bitcoind: | ||
| description: Whether to set up bitcoind/electrs | ||
| type: boolean | ||
| default: false | ||
| ci-env: | ||
| description: Whether to set CI_ENV=1 | ||
| type: boolean | ||
| default: false |
There was a problem hiding this comment.
Why not just always do these things?
There was a problem hiding this comment.
To save time if a sub CI script doesn't need it.
There was a problem hiding this comment.
Remove ci-env, unnecessary
There was a problem hiding this comment.
It should only take a few seconds to load the bitcoind/electrs files from cache/remote host? Can we just avoid the complexity?
There was a problem hiding this comment.
Shellcheck is a static analysis tool for shell scripts and belongs with the other linting checks rather than in the build matrix. This also prepares for splitting the build job into parallel sub-jobs, where running shellcheck in each sub-job would be redundant. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
55cda17 to
ee1c933
Compare
|
Moved the shellcheck step from the build job to the linting job in a preparatory commit, since it doesn't pertain to any particular sub-script and fits better alongside the other lint checks. The previous 4-job split had a significant imbalance (22m to 51m). This 6-job split groups steps by logical concern rather than trying to balance by time, and still achieves a better range (20m to 37m) because the two heaviest jobs got split along natural seams. Job timings from the latest CI run:
Total time across all jobs is 163m vs ~144m for the original monolithic script, so about 13% overhead from duplicated setup/compilation. Verified correctness by concatenating all sub-scripts (stripping per-file boilerplate) into a single file and diffing against the original script. Since steps moved between groups, the diff initially shows many moved blocks. To confirm nothing was lost or added, I manually reordered the blocks in the concatenated file to match the original order and verified the diff was empty. Open question: with 6 jobs each doing a fresh build, the |
TheBlueMatt
left a comment
There was a problem hiding this comment.
Aside from the comment about just re-fetching bitcoind this LGTM. I don't see a reason to avoid the 10-second cost for more complexity in CI scripts and jobs.
|
👋 The first review has been submitted! Do you think this PR is ready for a second reviewer? If so, click here to assign a second reviewer. |
Split the monolithic ci-tests.sh into six focused scripts that run as separate parallel jobs via a reusable workflow (ci-build.yml): - ci-tests-workspace.sh: workspace checks, tests, docs, and downstream compat - ci-tests-features.sh: crate feature/flag combinations (dnssec, tokio, backtrace, test vectors, serde) - ci-tests-bindings.sh: c_bindings builds and tests - ci-tests-nostd.sh: no_std builds and compatibility checks - ci-tests-cfg-flags.sh: experimental gated cfg flags (taproot, simple_close, lsps1_service, peer_storage) - ci-tests-sync.sh: block sync and transaction sync clients Shared setup (MSRV pins, backtrace) is extracted into ci-tests-common.sh. The ci-tests.sh script now delegates to the sub-scripts for local use. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
ee1c933 to
a026ace
Compare
TheBlueMatt
left a comment
There was a problem hiding this comment.
We'll have to monitor total usage to see if it creeps back up and we have to revert this to reduce load, but easy enough for now.

Split the CI
buildjob in multiple parallel running parts. Longest job runs for ~32 mins.